continuous observation
Predicting path-dependent processes by deep learning
In this paper, we investigate a deep learning method for predicting path-dependent processes based on discretely observed historical information. The method treats prediction as a nonparametric regression problem and obtains the regression function from simulated samples and deep neural networks. For fractional Brownian motion and the solutions of some stochastic differential equations driven by it, we prove theoretically that the $L_2$ errors converge to 0, and we further discuss the scope of the method. As the frequency of discrete observations tends to infinity, the predictions based on discrete observations converge to the predictions based on continuous observations, which implies that the method can be used to approximate the latter. We apply the method to the fractional Brownian motion and the fractional Ornstein-Uhlenbeck process as examples. Comparing the results with the theoretical optimal predictions, and taking the mean square error as the measure, the numerical simulations demonstrate that the method generates accurate results. We also analyze the impact on accuracy of factors such as the prediction period and the Hurst index.
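For a Gaussian process such as fractional Brownian motion, the optimal $L_2$ prediction from finitely many observations is the linear conditional expectation, which gives a closed-form baseline of the kind a learned predictor can be compared against. The sketch below is not the paper's deep learning method; it only computes that baseline from the standard fBm covariance. The function names (`fbm_cov`, `solve`, `predictor_weights`, `prediction_mse`) and the observation grid are assumptions made for illustration.

```python
def fbm_cov(s, t, hurst):
    """Covariance E[B_H(s) B_H(t)] of fractional Brownian motion."""
    return 0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(t - s) ** (2 * hurst))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def predictor_weights(obs_times, target_time, hurst):
    """Weights w such that E[B_H(T) | B_H(t_1..t_n)] = sum_i w_i * B_H(t_i)."""
    sigma = [[fbm_cov(s, t, hurst) for t in obs_times] for s in obs_times]
    cross = [fbm_cov(t, target_time, hurst) for t in obs_times]
    return solve(sigma, cross)


def prediction_mse(obs_times, target_time, hurst):
    """Mean square error of the optimal linear predictor (conditional variance)."""
    weights = predictor_weights(obs_times, target_time, hurst)
    cross = [fbm_cov(t, target_time, hurst) for t in obs_times]
    explained = sum(w * c for w, c in zip(weights, cross))
    return fbm_cov(target_time, target_time, hurst) - explained


# Observation times must be strictly positive: B_H(0) = 0 makes the matrix singular.
obs_times = [0.25, 0.5, 0.75, 1.0]
weights_bm = predictor_weights(obs_times, 1.5, 0.5)   # Hurst index 0.5: Brownian motion
weights_fbm = predictor_weights(obs_times, 1.5, 0.7)  # Hurst index 0.7: long memory
```

With Hurst index 0.5 the process is a martingale, so all the weight falls on the most recent observation; for other Hurst indices the earlier observations also contribute, which is the path dependence the abstract refers to.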
Optimistic MLE -- A Generic Model-based Algorithm for Partially Observable Sequential Decision Making
Liu, Qinghua, Netrapalli, Praneeth, Szepesvári, Csaba, Jin, Chi
This paper introduces a simple and efficient learning algorithm for general sequential decision making. The algorithm combines Optimism for exploration with Maximum Likelihood Estimation for model estimation, and is thus named OMLE. We prove that OMLE learns the near-optimal policies of an enormously rich class of sequential decision making problems in a polynomial number of samples. This rich class includes not only a majority of known tractable model-based Reinforcement Learning (RL) problems (such as tabular MDPs, factored MDPs, low witness rank problems, tabular weakly-revealing/observable POMDPs and multi-step decodable POMDPs), but also many new challenging RL problems, especially in the partially observable setting, that were not previously known to be tractable. Notably, the new problems addressed by this paper include (1) observable POMDPs with continuous observation and function approximation, where we achieve the first sample complexity that is completely independent of the size of the observation space; (2) well-conditioned low-rank sequential decision making problems (also known as Predictive State Representations (PSRs)), which include and generalize all known tractable POMDP examples under a more intrinsic representation; (3) general sequential decision making problems under the SAIL condition, which unifies our existing understanding of model-based RL in both fully observable and partially observable settings. The SAIL condition is identified by this paper and can be viewed as a natural generalization of Bellman/witness rank to address partial observability. This paper also presents a reward-free variant of the OMLE algorithm, which learns approximate dynamic models that enable the computation of near-optimal policies for all reward functions simultaneously.
Inference of collective Gaussian hidden Markov models
We consider inference problems for a class of continuous state collective hidden Markov models, where the data is recorded in aggregate (collective) form and is generated by a large population of individuals following the same dynamics. We propose an aggregate inference algorithm called the collective Gaussian forward-backward algorithm, which extends the recently proposed Sinkhorn belief propagation algorithm to models characterized by Gaussian densities. Our algorithm enjoys a convergence guarantee. In addition, it reduces to the standard Kalman filter when the observations are generated by a single individual. The efficacy of the proposed algorithm is demonstrated through multiple experiments.
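The single-individual special case mentioned above is the standard Kalman filter. The sketch below shows it for a scalar linear Gaussian model; it is not the collective algorithm from the paper. The model form ($x_{t+1} = a x_t + w_t$, $y_t = h x_t + v_t$ with noise variances $q$ and $r$) and all names are assumptions chosen for illustration.

```python
def kalman_filter(observations, a, h, q, r, mean0, var0):
    """Scalar Kalman filter.

    Assumed model (illustrative):
        x_{t+1} = a * x_t + w_t,  w_t ~ N(0, q)
        y_t     = h * x_t + v_t,  v_t ~ N(0, r)
    Returns the filtered (mean, variance) of x_t after each observation.
    """
    mean, var = mean0, var0
    filtered = []
    for y in observations:
        # Measurement update: condition the current prior on y.
        gain = var * h / (h * h * var + r)
        mean = mean + gain * (y - h * mean)
        var = (1.0 - gain * h) * var
        filtered.append((mean, var))
        # Time update: propagate the posterior through the dynamics.
        mean = a * mean
        var = a * a * var + q
    return filtered


# Static state (a = 1, q = 0) observed with unit noise from a N(0, 1) prior.
estimates = kalman_filter([1.0, 2.0, 3.0], a=1.0, h=1.0, q=0.0, r=1.0,
                          mean0=0.0, var0=1.0)
final_mean, final_var = estimates[-1]
```

In this static case the filter reproduces the conjugate Gaussian posterior: after n unit-noise observations the mean is the sum of the observations divided by n + 1, and the variance is 1 / (n + 1).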
Filtering for Aggregate Hidden Markov Models with Continuous Observations
Zhang, Qinsheng, Singh, Rahul, Chen, Yongxin
We consider a class of filtering problems for large populations where each individual is modeled by the same hidden Markov model (HMM). In this paper, we focus on aggregate inference problems in HMMs with discrete state space and continuous observation space. The continuous observations are aggregated in such a way that the individuals are indistinguishable from measurements. We propose an aggregate inference algorithm called the continuous observation collective forward-backward algorithm. It extends the recently proposed collective forward-backward algorithm for aggregate inference in HMMs with discrete observations to the case of continuous observations. The efficacy of this algorithm is illustrated through several numerical experiments.
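As background for the setting above, the sketch below runs the ordinary forward (filtering) recursion for a single HMM with discrete states and continuous observations. It is not the collective forward-backward algorithm of the paper, which handles aggregated data. Gaussian emissions, the two-state parameters and the function names are assumptions made for illustration.

```python
import math


def gaussian_pdf(y, mean, std):
    """Density of N(mean, std^2) evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(observations, initial, transition, means, stds):
    """Filtered state distributions P(x_t | y_1..y_t) for a single individual.

    transition[i][j] is P(x_{t+1} = j | x_t = i); state j emits N(means[j], stds[j]^2).
    """
    n = len(initial)
    belief = list(initial)
    history = []
    for t, y in enumerate(observations):
        if t > 0:
            # Prediction step: push the belief through the transition matrix.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Correction step: weight by the emission density and renormalize.
        weighted = [belief[j] * gaussian_pdf(y, means[j], stds[j]) for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


# Hypothetical two-state example with sticky transitions.
beliefs = forward_filter(
    observations=[1.0, 1.2, 0.8],
    initial=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.1, 0.9]],
    means=[-1.0, 1.0],
    stds=[1.0, 1.0],
)
```

Because the observation space is continuous, the emission term is a density rather than a probability table; the recursion is otherwise the same as in the discrete-observation case.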
Symbolic Dynamic Programming for Continuous State and Observation POMDPs
Zamani, Zahra, Sanner, Scott, Poupart, Pascal, Kersting, Kristian
Partially-observable Markov decision processes (POMDPs) provide a powerful model for real-world sequential decision-making problems. In recent years, point-based value iteration methods have proven to be extremely effective techniques for finding (approximately) optimal dynamic programming solutions to POMDPs when an initial set of belief states is known. However, no point-based work has provided exact point-based backups for both continuous state and observation spaces, which we tackle in this paper. Our key insight is that while there may be an infinite number of possible observations, there are only a finite number of observation partitionings that are relevant for optimal decision-making when a finite, fixed set of reachable belief states is known. To this end, we make two important contributions: (1) we show how previous exact symbolic dynamic programming solutions for continuous state MDPs can be generalized to continuous state POMDPs with discrete observations, and (2) we show how this solution can be further extended via recently developed symbolic methods to continuous state and observations to derive the minimal relevant observation partitioning for potentially correlated, multivariate observation spaces. We demonstrate proof-of-concept results on uni- and multi-variate state and observation steam plant control.